Caption Text Recognition in Video Frames by MAP Matching

Authors

Akira Nakamura

Kazuhiko Yamamoto

Abstract

This paper describes an approach to detecting caption text in video frames. Text recognition in video has many applications, but problems remain, such as insufficient resolution and the complexity of layouts and backgrounds. This study attempts to solve these problems with a segmentation-free approach called the MAP matching method. Besides extending the method to grayscale images, the paper discusses a strategy for handling character size variation using Gaussian filtering and multi-sized reference patterns, as well as a method for detecting frames that contain caption text. Results show that the proposed matching method can detect characters of unknown size in caption text. Although over-detection is not negligible, verifying the positions of detected characters can identify the location of keywords with practical precision. Frames containing caption text are also detected with nearly 98% accuracy.

Similar resources

Visualizing Multimedia Content on Paper Documents: Components of Key Frame Selection for Video Paper

The components of a key frame selection algorithm for a paper-based multimedia browsing interface called Video Paper are described. Analysis of video image frames is combined with the results of processing the closed caption to select key frames that are printed on a paper document together with the closed caption. Bar codes positioned near the key frames allow a user to play the video from the...

Content Based Image and Video Retrieval Using Embedded Text

Extraction of text from images and video is an important step in building efficient indexing and retrieval systems for multimedia databases. We adopt a hybrid approach for such text extraction by exploiting a number of characteristics of text blocks in color images and video frames. Our system detects both caption text and scene text of different font, size, color and intensity. We have d...

Precise News Video Text Detection/Localization Based on Multiple Frames Integration

This paper presents a multiple frames integration based approach to detect and localize static caption texts on news videos. Utilizing the temporal information of videos, the algorithm includes robust text features and the non-text line deletion technique, and yields precise and tight localization for detected text regions. The Canny edge detector is first applied on reference frames and is fol...

Designing caption production rules based on face, text, and motion detection

Producing off-line captions for deaf and hearing-impaired people is a labor-intensive task that can require up to 18 hours of production per hour of film. Captions are placed manually close to the region of interest, but they must avoid masking human faces, text or any moving objects that might be relevant to the story flow. Our goal is to use image processing techniques to reduce the off-lin...

Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study

Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with additional difficulties, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...

Publication date: 2003
